Loading data from a file
Problem
You want to load data from a file.
Solution
Delimited text files
The simplest way to import data is to save it as a text file with delimiters such as tabs or commas (CSV).
data <- read.csv("datafile.csv") # Load a CSV file that doesn't have headers data <- read.csv("datafile-noheader.csv", header=FALSE)
The function read.table()
is a more general function which allows you to set the delimiter, whether or not there are headers, whether strings are set off with quotes, and more.
data <- read.table("datafile-noheader.csv", header=FALSE, sep= "," # use "\t" for tab-delimited files )
Data can also be loaded from a URL. These (very long) URLs point to the attachments on this page.
data <- read.csv("http://wiki.stdout.org/rcookbook/Data%20input%20and%20output/Loading%20data%20from%20a%20file/datafile.csv") # Read in a CSV file without headers data <- read.csv("http://wiki.stdout.org/rcookbook/Data%20input%20and%20output/Loading%20data%20from%20a%20file/datafile-noheader.csv", header=FALSE) # Manually assign the header names names(data) <- c("subject","sex","control","cond1","cond2")
The data files used above:
"subject","sex","control","cond1","cond2" 1,"M",7.9,12.3,10.7 2,"F",6.3,10.6,11.1 3,"F",9.5,13.1,13.8 4,"M",11.5,13.4,12.9
1,"M",7.9,12.3,10.7 2,"F",6.3,10.6,11.1 3,"F",9.5,13.1,13.8 4,"M",11.5,13.4,12.9
Fixed-width text files
Suppose your data has fixed-width columns, like this:
subject sex control cond1 cond2 1 M 7.9 12.3 10.7 2 F 6.3 10.6 11.1 3 F 9.5 13.1 13.8 4 M 11.5 13.4 12.9
One way to read it in is to simply use read.table()
as above, which will treat consecutive spaces as a single delimiter.
read.table("clipboard", header=TRUE, strip.white=TRUE)
However, your data file may have columns containing spaces, or columns with no spaces separating them, like this, where the scores column represents six different measurements, each from 0 to 3.
subject sex scores N 1 M 113311 NE 2 F 112231 S 3 F 111221 W 4 M 011002
In this case, you may need to use the read.fwf()
function. If you read the column names from the file, it requires that they be separated with a delimiter like a single tab, space, or comma. If they are separated with multiple spaces, as in this example, you will have to assign the column names directly.
# Assign the column names manually read.fwf("myfile.txt", c(7,5,-2,1,1,1,1,1,1), # Width of the columns. -2 means drop those columns skip=1, # Skip the first line (contains header here) col.names=c("subject","sex","s1","s2","s3","s4","s5","s6"), strip.white=TRUE) # Strip out leading and trailing whitespace when reading each # subject sex s1 s2 s3 s4 s5 s6 # N 1 M 1 1 3 3 1 1 # NE 2 F 1 1 2 2 3 1 # S 3 F 1 1 1 2 2 1 # W 4 M 0 1 1 0 0 2 # If the first row looked like this: # subject,sex,scores # Then we could use header=TRUE: read.fwf("myfile.txt", c(7,5,-2,1,1,1,1,1,1), header=TRUE, strip.white=TRUE)
Excel files
See http://wiki.r-project.org/rwiki/doku.php?id=tips:data-io:ms_windows.